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Abstract 


This paper describes efficient computational procedures for 
calculating all possible 2“ regressions of a dependent variable 
upon subsets of k independent variables, The principal result 

of the paper is contained in theorem 1, which provides a con- 
structive proof that all 2% possible regressions can be accom- 
plished by "sweeping" the cross-products matrix exactly 2* times. 


It is shown also that further economies in computation can be 
achieved by taking advantage of a general symmetry property of 
the cross-products matrix at each stage, and applying the sweep 
operation itself to a minimal sized submatrix at each step. The 
paper provides an example involving four independent variables, 
as well as a Fortran routine for generating an optimal sequence 
of sweeps. 
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I, INTRODUCTION 


Numerous procedures have been proposed for the selection of a 
subset of k independent variables in fitting a multiple regression equation 
to a set of data. Included among these are various forward and backward 
stepwise selection techniques (Efroymson 1960, Hamaker 1962, Oosterhoff 
1963) and the Cc. statistic proposed by C. Mallows (1964.) Theoretically 
speaking, none of these methods can claim to achieve optimality, so that 
the sure way of finding that regression which is best according to some 
criterion is to carry out all possible 2k regressions and use that criteri- 
on to select.the "best" of the 2* possible regressions. Because the 
number of calculations increases exponentially with k, it is particularly 
important to have an efficient algorithm for carrying out the necessary 
computations. However, if k is sufficiently large, it may not be 
practical to carry out all of the possible regressions, In such cases, 
Gorman and Toman (1966) have suggested the use of fractional factorial 
designs for selecting a subset of the a regressions, They then use the 
Cy statistic as a criterion for selecting the’ best" regression. In this 
situation as well, it is important to carry out the computations efficiently 
and to extract as much information as possible from the data. In this 
paper, we describe an efficient way of fitting all possible regressions, and 


offer some comments on the fractional factorial case, 


2m 

The calculations required to fit a regression equation to a given set 
of data are essentially those of solving a set of simultaneous linear equations. 
A commonly used, and usually efficient direct method of carrying out such 
calculations is the Gaussian elimination (or pivotal inversion) method. 
(Wilkinson, 1965). 

In section 2 of this paper we describe a variant of the usual Gaussian 
elimination method, hereinafter referred to as sweep (Beaton 1964), and show 
how it can be used to add and delete independent variables from a fitted 
regression. We then go on to show, in section 3, how to carry out all possible 
regressions efficiently. An example of the procedure is given in section 4. 

In section 5, we offer some comments about the problem of selecting 
balanced fractions of all the possible regressions, as proposed by Gorman and 


Toman. 


Il, APPLICATION OF SWEEPING TO MULTIPLE REGRESSION 
Following Beaton (1964), a square matrix A = (a,,) is said to have 
4 
been swept on the ath row and column (or th pivotal element) when it has 


been transformed into a matrix B = (b,,) such that 


bey = Magy 
biy = -8jr/arr ifr 
(2.1) 
b,j = ayj/apy j#r 
bi By r Byy Ayj/apy i,j #r 


It is easy to check from (2.1) that the sweep operator possesses the 
following useful properties: 
1. Sweep is reversible. 
That is, sweeping a matrix twice on the same row and column 
is equivalent to not having swept the mm trix at all. 
2. Sweep is commutative. 
That is, sweeping a matrix first on the rth and then on the 
sth pivotal element is equivalent to sweeping the matrix in 
the opposite order. 
In terms of regression analysis, consider the normal equations of regres- 


sion theory in their matrix representation 


2.2 x'xB = x! 
( 2 ) n~ wa nw wv x 
A 
x rae Y B 
1 ° 
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If the cross product matrix 
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is swept on the first k+l pivotal elements, then provided that (x'xy7 
vs 


exists, the result will be 
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A A 
where B and o” are the least squares estimates for the regression co- 
lad 


efficients and error variance respectively. In general, sweeping © on any 
subset of the first k+l pivotal elements will yield the estimates of the 
regression coefficients and error variance corresponding to the regression of 
Y on that subset of the X's. Thus, by the reversibility and commutativity 
properties of sweep, each application of sweep to a particular row and column 
of the cross-products matrix either introduces the variable corresponding to 
that row and column into the fitted regression equation, or removes it if it 
was already in the equation. This property of sweep suggests that it would 


provide an efficient method for calculating all regressions. 


NI. DOING ALL POSSIBLE REGRESSIONS 

In using the sweeping method to do all possible regressions, computational 
efficiency can be achieved in two ways. First, it is important to obtain the ak 
regressions with a minimum number of sweeps. Having achieved this, we 
would like to carry out each sweep as efficiently as possible. In theorem 1 
below, we give a constructive proof that the 2k possible regressions can be 
carried out in a sequence of just 2k sweeps, which is of course the minimum 


- number of required sweeps. We then go on to show how to effect a significant 


a 





reduction in the number of calculations needed for each sweep in the sequence. 
These economies are brought about by sweeping the smallest possible sub- 
matrix at each stage and using the inherent symmetry properties of the 
matrix in such a way as to operate only on the upper triangular part of the 


submatrix. 


THEOREM 1. All possible 2* regressions of a dependent variable ona 
set of k independent variables can be obtained through a sequence of 
exactly af sweeps of the (k+2) x (k+2) cross product matrix & 
Proof: (By mathematical induction) 

First, sweep Ss on the zeroth pivotal element to produce the 
regression = Bo - Denote the resulting matrix by put We proceed 
to show that the additional ak 1 regressions can be obtained with 
exactly ae 1 sweeps of W. 

When k=l, 2k 121 and the single ceunewnton” = §+4,x, is 
performed by sweeping on the first pivotal element of wW. 

Assume that the ara possible regressions on k-1 variables can 
be obtained in a sequence S, 1 of exactly ae -l sweeps on the first k 
pivotal elements. Now, sweeping on the (k,k) pivotal element will 
produce a new regression which adds the th variable to the regression 
produced by the sequence Shea Since the dvienes S,.1 produced 
2k-1 distinct regressions not including the kth yariable, repetition of the 


sequence S will now produce another ge distinct regressions inclu- 


k-1 
ding the ,th variable. The total number of sweeps in S. = (S,, -p® Sx-1) 


k 


is thus seen to be (2-1 -l)+ 1+ (2k-4 -1) = 2° -1. Together with the sweep 


which produced W and the regression Y=B,, we see that all possible 2* distinct 


regressions will have been produced in exactly 2h sweeps, 


The proof suggests a recursive algorithm for constructing the sequence 


Sy that is 


k-1’ 


key Shay ), as illustrated in Table 1. 


TABLE I. 


1213121 


PAS Be ZN as a aD 3 Zee 


> eat 
33 . S3 


(Sy 1,1, Si) 


However, it is not actually necessary to construct the sequence 


recursively, since the recursion formula for Sx reveals that the i 


th pivotal 


jst 


element (i= 1,2,...,k) is swept for the first time on the Din . sweep 


and is swept on every 2° th sweep thereafter. Table 2 provides a simple Fortran 


program for generating S. 


k directly by the above considerations. 


Fortran Routine for Generating Sequence of All Regressions 


TABLE 2 
k = no. of variables 
PI = no. of possible regressions (in addition to the regression 
A 
$= 8) 
input k 


output I = vector of length | containing sequence of sweeps 


SUBROUTINE REGR (K, ]) 


DIMENSION I (2048) 

DO1J=1,K 

JJ = 2 ** (J-1) 
_INCR = 2* JJ 

I (JJ) =I 

JIJ = 2 **(K-J) -1 

IF (JJJ) 1,1,3 

DO 2 M=1,JJJ 

JJ=IJ + INCR 

(JJ) = 5 

CONTINUE 

RETURN 


END 





Having obtained an optimum sequence of sweeps, we now turn 
to the problem of carrying out each sweep as efficiently as possible. We 
note from (2.1) that the symmetric property of the cross product matrix 
is partially destroyed by the sweep operation. That is, when a symmetric 
matrix A is swept on the rth pivotal element, the resulting matrix B 
has the property be =-b_., and ar = By for i,j#r. This particular 
property of sweep gives rise to the definition of an absolute symmetric 
matrix Ibjl=lb,,! for all i and j. 


In general, the relation between By and Bis may be found 


th h 


readily if we know how many times A has been swept on the i’ and jt 
pivotal elements. Define a parity vector T = (to tyreees ty) where 


t =1 if the matrix A has been swept an even number of times on the job 
pivotal element and t = - 1 otherwise. The property of absolute sym- 
metry, combined with the parity vector T permits us to sweep only the 


upper triangular part of the matrix at each step, where the sweep operation(2» 1) 


is redefined by (3.1) below. 


by * Wins 

b,. = -aj,/a. igr 

Pg = Hl ®, ome 

ar = a - a, he j7i 


e = i 3 
where aay = tytyavu if u>v 


It should be noted that Lae 1 for all r initially (i.e. - for 


symmetric A). Each time the matrix is swept on the rth pivotal element, 


the sign of t. must be reversed. At any step in a sequence of sweeps, the 


od 


variable x. is included in the regression equation if ia = -l. 

It should be noted that theorem 1 gives a sequence of sweeps 
to be applied to the entire (k+2) x (k+2) cross product matrix Ss However, 
a further substantial savings in computing time can be achieved by applying 
the sweep operator only to that submatrix containing the relevant elements 
at the particular stage of the regression, 

In the case of sweeping the matrix on a pivotal element which 
results in adding a variable, say Xe to the regression equation, the 
submatrix to be swept must include all variables which are already in the 
equation, any variable which will be entered into the equation before x, is 
removed, and the dependent variable. When the variable x; is next 
deleted, the same submatrix must be swept. The procedure is summarized 
by the following simple rule, which gives the sequence of minimum sized 


submatrices associated with the successive sweeps. 


Rule 1: 


When sweeping the ith 


variable, the submatrix to be swept 
consists of the rows and columns with indices 0,1,2,..., 
itl, all j>i+t 1 such that t; = -l, and k+l (the index of 


the dependent variable). 


To summarize then, the recommended procedure for calculating 
all 2‘ regressions in a minimum number of computations is as follows: 
1. Carry out a sequence of-sweeps as given by theorem l. 
2. For each such sweep: 
a. Apply Rule 1 to obtain the minimum sized submatrix to 


“ee 
be swept. 


= 10° = 


b. Use (3.1) to apply the sweep operator only to the 
upper triangular part of the submatrix defined in 
2a above. 

In the next section, we will apply the above procedure to an 


example in which k=4. 





Sales 
Iv. A FOUR VARIABLE EXAMPLE 


The basis for the fundamental result of the previous section 
(i.e., Theorem 1) can be illustrated by figure 1, which gives a geometric 
representation of a four variable example in terms of travelling along 
edges of cubes, where the ith coordinate of a vertex indicates the presence 


(1) or absence (0) of the jth 


variable in the regression equation. Starting 
at (0,0,0,0) the arrows indicate the successive regressions, and the number 
alongside each arrow indicates the pivotal element to be swept in order 

to move from the regression represented by the starting vertex to that of 
the ending vertex. Thus, for example, the vertex (1,1,0,0) represents the 


A 
regression $ = 4 + 5, a +B _ » and sweeping variable x, will re- 


0 2 
move it from the regression, taking us to the vertex (0,1,0,0), which 
A a A 
represents the regression Y= B™ +B” X . Table 3 summarizes the se- 
0 2 2 
quence of sweeps, the resulting regressions, and the applicable submatrices 


corresponding to the sweep sequence depicted in figure 1. 


i , 4 . is 
Geometric Representation of 2° Regressions in 24 Sweeps 


Givly Ld 






1 A(o,!,00). — — (0, |, 1,1) 


4 \ 
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KEY: 1. The it® coordinate of each vertex indicates the presence (1) or 
absence (0) of the ith variable in the regression, 


2. Starting at (0,0,0,0)» arrows indicate successive regressions, 
and the number alongside each arrow indicates the pivotal element 
to be swept. 
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SEQUENCE OF SWEEPS AND CORRESPONDING REGRESSIONS 


Order of 
Sweep 


OnDNAOLPWNeE 


Sequence of 


Sweeps (S;,) 


BNF WOE NY RENKYEWHENHYO 


Vertex 


(0, 0, 0, 0) 
(1,0, 0, 0) 
(1, 1, 0, 0) 
(0, 1, 0, 0) 
(0, 1, 1, 0) 
(1, 1, 1, 0) 
(1,0, 1, 0) 
(0, 0, 1, 0) 
(0,0, 1, 1) 
(1,0, 1, 1) 
(1,1, 1, 1) 
(0,1, 1, 1) 
(0, 1,0, 1) 
ql, 1, 0, 1) 
(1,0, 0, 1) 
(0,0, 0, 1) 


Table 


Independent variables 
included inRegression 


none 


Submatrix 


to be swept 
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V. SOME COMMENTS ON FRACTIONAL REPLICATION 


If the number of independent variables under consideration is 
sufficiently large, it may be impractical to calculate all possible regressions. 
However, in such instances, it may be possible to identify the important variables by 
restricting the search to a subset of the possible regressions. . 

Gorman and Toman (1966) have proposed a procedure involving 
calculation of only a balanced fraction of the possible regressions, and use 
of the Cp statistic to select one of these. We shall, in this section, investi- 
gate the use of this procedure in terms of the sweep operation. 

With reference to figure 1, if XX, X4X4 is chosen as the de- 
fining contrast for a 1/2 replicate, we obtain the regressions corresponding 


to the circled vertices as listed in Table 4. 


Order of Sequence of Vertex Treatment Combination (independent 
Sweeps Sweeps Variables Included in Regression) 
1 0 (0, 0, 0, 0) none 

(rel 1,2 (1, 1, 0, 0) 132 

4,5 1,3 (0, 1, 1, 0) eee) 

6,7 1,2 (1, 0, 1, 0) 1,3 

8,9 1,4 (0, 0, 1, 1) 3, 4 

10,11 1,2 (1,1, 1, 1) 1,2, 3,4 

12,13 1,3 (0, 1,0, 1) 2,4 

14, 15 1,2 (1, 0,0, 1) 1,4 

Table 4. 


It is immediately apparent from the above listing of treatment 
combinations that at least two sweeps are required to move from any vertex 
in the design to any other vertex in the design. Thus, the computation re- 


quired to find the regressions corresponding to a half replicate of all the 
“ee 


= [4 Js 


possible regressions will actually produce all possible regressions. (In 
this example, one additional sweep is required in order to reach (0, 0,0, 1) 
but if the other half-replicate had been selected, this sweep would have 
been required anyhow.) It is readily seen that regardless of the number 
of variables, any balanced half replicate requires us, at least, to move 
along two edges of a cube to get from one point to another in the design. 
Similarly, in the one quarter replicate case, one must again travel along 
at least three edges of a cube before reaching another admissible vertex. 
Thus, in general, at least 2k-ptl sweeps will be required to carry outa 
balanced 2"P fraction of the possible 2k regressions, 

The situation may be summarized as follows: 

1. In using a sequence of sweep operations to calculate a 
subset of the possible regressions, it is meaningless to 
consider a half-replicate since the required computations 
will automatically yield the entire as regressions. 

oa The computations required to carry out a quarter- 
replicate, automatically yield an additional (unbalanced) 
quarter-replicate. 

3. In carrying out a g02 fraction of all the possible re- 
gressions, it does not cost anything to look at the additional 
regressions which are automatically produced in the course 
of moving from one regression to another. 

The authors do not know of any method for generating balanced 
fractions of all regressions in an optimal fashion. Since as noted earlier, 
the number of possible regressions increases exponentially with k, the 
number of independent variables, a solution to the above problem, if it can 


be obtained, would constitute a valuable contribution. 
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